neural graph
Neural Graph Matching Improves Retrieval Augmented Generation in Molecular Machine Learning
Wang, Runzhong, Wang, Rui-Xi, Manjrekar, Mrunali, Coley, Connor W.
Molecular machine learning has gained popularity with the advancements of geometric deep learning. In parallel, retrieval-augmented generation has become a principled approach commonly used with language models. However, the optimal integration of retrieval augmentation into molecular machine learning remains unclear. Graph neural networks stand to benefit from clever matching to understand the structural alignment of retrieved molecules to a query molecule. Neural graph matching offers a compelling solution by explicitly modeling node and edge affinities between two structural graphs while employing a noise-robust, end-to-end neural network to learn affinity metrics. We apply this approach to mass spectrum simulation and introduce MARASON, a novel model that incorporates neural graph matching to enhance a fragmentation-based neural network. Experimental results highlight the effectiveness of our design, with MARASON achieving 28% top-1 accuracy, a substantial improvement over the non-retrieval state-of-the-art accuracy of 19%. Moreover, MARASON outperforms both naive retrieval-augmented generation methods and traditional graph matching approaches.
MBExplainer: Multilevel bandit-based explanations for downstream models with augmented graph embeddings
Golgoon, Ashkan, Franks, Ryan, Filom, Khashayar, Kannan, Arjun Ravi
In many industrial applications, it is common for the graph embeddings generated by training GNNs to be used in an ensemble model, where the embeddings are combined with other tabular features (e.g., original node or edge features) in a downstream ML task. Tabular features may even arise naturally if, e.g., one builds a graph in which some of the node or edge features are stored in tabular format. Here we address the problem of explaining the output of such ensemble models, whose input features consist of learned neural graph embeddings combined with additional tabular features. We propose MBExplainer, a model-agnostic explanation approach for downstream models with augmented graph embeddings. MBExplainer returns a human-legible triple as an explanation for an instance prediction of the whole pipeline, consisting of three components: the subgraph with the highest importance, the topmost important nodal features, and the topmost important augmented downstream features. A game-theoretic formulation takes the contributions of each component and their interactions into account by assigning three Shapley values corresponding to their own specific games. Finding the explanation requires an efficient search through the local search space corresponding to each component. MBExplainer applies a novel multilevel search algorithm that enables simultaneous pruning of the local search spaces in a computationally tractable way. In particular, three interleaved Monte Carlo Tree Searches are used to iteratively prune the local search spaces. MBExplainer also includes a global search algorithm that uses contextual bandits to efficiently allocate the pruning budget among the local search spaces. We show the effectiveness of MBExplainer through a set of comprehensive numerical examples on multiple public graph datasets, for both node and graph classification tasks.
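The game-theoretic attribution over the three components can be sketched with an exact Shapley computation: each component's value is its marginal contribution averaged over all orderings in which components are added. The three-player setup and the characteristic function below are toy assumptions for illustration, not MBExplainer's actual games.

```python
import math
from itertools import permutations

def shapley(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            phi[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_fact = math.factorial(len(players))
    return {p: v / n_fact for p, v in phi.items()}

# Toy characteristic function: prediction score when a subset of the three
# explanation components (subgraph, nodal, augmented features) is retained.
scores = {
    frozenset(): 0.0,
    frozenset({"subgraph"}): 0.5,
    frozenset({"nodal"}): 0.2,
    frozenset({"augmented"}): 0.1,
    frozenset({"subgraph", "nodal"}): 0.8,
    frozenset({"subgraph", "augmented"}): 0.6,
    frozenset({"nodal", "augmented"}): 0.3,
    frozenset({"subgraph", "nodal", "augmented"}): 1.0,
}
phi = shapley(["subgraph", "nodal", "augmented"], scores.__getitem__)
```

Exact enumeration is feasible here only because there are three players; Shapley values satisfy efficiency, so the three attributions sum to the full-pipeline score minus the empty baseline.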
Accelerating Training with Neuron Interaction and Nowcasting Networks
Knyazev, Boris, Moudgil, Abhinav, Lajoie, Guillaume, Belilovsky, Eugene, Lacoste-Julien, Simon
Neural network training can be accelerated when a learnable update rule is used in lieu of classic adaptive optimizers (e.g. Adam). However, learnable update rules can be costly and unstable to train and use. A simpler, recently proposed approach to accelerate training is to use Adam for most of the optimization steps and, periodically, only every few steps, nowcast (predict future) parameters. Recently, Jang et al. (2023) and Sinha et al. (2017) showed that parameters θ follow a predictable trend, whereas more popular learnable approaches to speed up optimization, such as "learning to optimize" (L2O), are recurrently applied at every step t (Andrychowicz et al., 2016; Metz et al., 2022). We show that in some networks, such as Transformers, neuron connectivity is non-trivial; this structure has been shown to be critical for many parameter representation tasks, such as property prediction (Navon et al., 2023; Zhou et al., 2023; Kofinas et al., 2024). We use Adam throughout the paper, but our discussion and methods are in principle applicable to any optimizer that produces a trajectory of parameters, including SGD with/without momentum, AdamW, Adagrad, etc.
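The nowcasting step above can be sketched with the simplest possible predictor: fit a per-parameter linear trend to the most recent checkpoints and extrapolate it a few steps ahead. This is only a least-squares baseline under the "parameters follow a predictable trend" observation; the paper's actual model learns the prediction network rather than fitting a fixed line.

```python
import numpy as np

def nowcast(param_history, horizon):
    """Predict parameters `horizon` steps ahead of the last checkpoint by
    fitting a per-parameter linear trend (least squares over time)."""
    k, d = param_history.shape              # k past checkpoints, d parameters
    t = np.arange(k, dtype=float)
    A = np.stack([np.ones(k), t], axis=1)   # design matrix for a + b*t
    coef, *_ = np.linalg.lstsq(A, param_history, rcond=None)
    a, b = coef                             # intercepts and slopes, one per parameter
    return a + b * (k - 1 + horizon)

# Two parameters drifting linearly: nowcasting lands exactly on the trend.
history = np.array([[0.0, 1.0], [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]])
pred = nowcast(history, horizon=5)          # 5 steps past the last checkpoint
```

Because the prediction is applied only every few steps, its cost is amortized over the ordinary Adam updates in between.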
Graph Neural Networks for Learning Equivariant Representations of Neural Networks
Kofinas, Miltiadis, Knyazev, Boris, Zhang, Yan, Chen, Yunlu, Burghouts, Gertjan J., Gavves, Efstratios, Snoek, Cees G. M., Zhang, David W.
Neural networks that process the parameters of other neural networks find applications in domains as diverse as classifying implicit neural representations, generating neural network weights, and predicting generalization errors. However, existing approaches either overlook the inherent permutation symmetry in the neural network or rely on intricate weight-sharing patterns to achieve equivariance, while ignoring the impact of the network architecture itself. In this work, we propose to represent neural networks as computational graphs of parameters, which allows us to harness powerful graph neural networks and transformers that preserve permutation symmetry. Consequently, our approach enables a single model to learn from neural graphs with diverse architectures.
How can we design neural networks that themselves take neural network parameters as input? This would allow us to make inferences about neural networks, such as predicting their generalization error (Unterthiner et al., 2020), generating neural network weights (Schürholt et al., 2022a), and classifying or generating implicit neural representations (Dupont et al., 2022) without having to evaluate them on many different inputs. For simplicity, let us consider a deep neural network with multiple hidden layers. As a naïve approach, we can simply concatenate all flattened weights and biases into one large feature vector, from which we can then make predictions as usual. However, this overlooks an important structure in the parameters: neurons in a layer can be reordered while maintaining exactly the same function (Hecht-Nielsen, 1990). Reordering neurons of a neural network means permuting the preceding and following weight matrices accordingly. Ignoring the permutation symmetry will typically cause this model to make different predictions for different orderings of the neurons in the input neural network, even though they represent exactly the same function.
In general, accounting for symmetries in the input data improves the learning efficiency and underpins the field of geometric deep learning (Bronstein et al., 2021). Recent studies (Navon et al., 2023; Zhou et al., 2023a) confirm the effectiveness of equivariant layers for parameter spaces (the space of neural network parameters) with specially designed weight-sharing patterns. These weight-sharing patterns, however, require manual adaptation to each new architectural design. Importantly, a single model can only process neural network parameters for a single fixed architecture.
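The permutation symmetry described above is easy to verify numerically: reordering the hidden neurons of an MLP, with the matching permutation applied to the columns of the preceding weight matrix and the rows of the following one, leaves the network's function unchanged. The shapes below are arbitrary toy choices.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """Two-layer MLP with a ReLU hidden layer."""
    h = np.maximum(0, x @ W1 + b1)
    return h @ W2 + b2

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)
x = rng.normal(size=(10, 3))

perm = rng.permutation(5)            # reorder the 5 hidden neurons
W1p, b1p = W1[:, perm], b1[perm]     # permute columns of the preceding weights
W2p = W2[perm, :]                    # permute rows of the following weights

# Different parameter vector, same function:
assert np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1p, b1p, W2p, b2))
```

A model that flattens (W1, b1, W2, b2) into one vector would see these two parameterizations as different inputs, which is exactly the failure mode permutation-equivariant architectures avoid.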
Convexification of Neural Graph
Traditionally, most complex intelligence architectures are highly non-convex and therefore cannot be handled well by convex optimization. This paper, however, decomposes complex structures into three types of nodes: operators, algorithms, and functions. By propagating iteratively from node to node along the edges, we prove that a tree-structured neural graph is nearly convex in each variable when the other variables are fixed. In fact, the non-convex properties stem from cycles and functions, which can be made convex with our proposed scale mechanism. Experimentally, we justify our theoretical analysis with two practical applications.
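The "convex in each variable when the others are fixed" property can be illustrated with a toy biconvex objective and exact coordinate minimization: f(x, y) = (xy - 1)^2 + λ(x^2 + y^2) is non-convex jointly but a convex quadratic in x for fixed y (and vice versa), so each coordinate update has a closed form. This is a generic alternating-minimization sketch, not the paper's scale mechanism.

```python
# Biconvex toy objective: non-convex in (x, y) jointly, convex per coordinate.
lam = 0.1

def f(x, y):
    return (x * y - 1) ** 2 + lam * (x ** 2 + y ** 2)

x, y = 2.0, 0.5
for _ in range(100):
    x = y / (y ** 2 + lam)   # exact argmin over x with y fixed (convex quadratic)
    y = x / (x ** 2 + lam)   # exact argmin over y with x fixed
```

Each step solves a convex subproblem exactly, so f never increases along the iterates; here they converge to the symmetric fixed point x = y = sqrt(1 - λ).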